Exemplar-based Robust Coherent Biclustering

Authors

  • Kewei Tu
  • Xixiu Ouyang
  • Dingyi Han
  • Vasant Honavar
Abstract

The biclustering, co-clustering, or subspace clustering problem involves simultaneously grouping the rows and columns of a data matrix to uncover biclusters, or sub-matrices of the data matrix, that optimize a desired objective function. In coherent biclustering, the objective function contains a coherence measure of the biclusters. We introduce a novel formulation of the coherent biclustering problem and use it to derive two algorithms. The first algorithm is based on loopy message passing; the second relies on a greedy strategy and is significantly faster than the first. A distinguishing feature of these algorithms is that they identify an exemplar, or prototypical member, of each bicluster. We note the interference from background elements in biclustering and offer a means to circumvent such interference using additional regularization. Our experiments with synthetic as well as real-world datasets show that our algorithms are competitive with the current state-of-the-art algorithms for finding coherent biclusters.
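The abstract does not say which coherence measure the objective function uses. Purely as an illustration, the sketch below scores a candidate bicluster with the mean squared residue of Cheng and Church, a widely used coherence score that may differ from the one in the paper. The function name `mean_squared_residue` and the toy matrix `data` are assumptions made for this example; a score of zero means the sub-matrix follows a perfect additive row-plus-column pattern.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the sub-matrix selected by `rows` and `cols`.

    Illustrative coherence score only; not necessarily the measure used
    in the paper. Lower values mean a more coherent bicluster.
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])

    # Row means, column means, and overall mean of the sub-matrix.
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows

    # Residue of each cell: its deviation from an additive row + column model.
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Toy matrix (hypothetical data): the top-left 3x3 block is additively
# coherent, while the last row and last column act as background noise.
data = [
    [1.0, 2.0, 3.0, 9.0],
    [2.0, 3.0, 4.0, 0.0],
    [4.0, 5.0, 6.0, 5.0],
    [7.0, 1.0, 8.0, 2.0],
]

coherent = mean_squared_residue(data, rows=[0, 1, 2], cols=[0, 1, 2])
whole = mean_squared_residue(data, rows=[0, 1, 2, 3], cols=[0, 1, 2, 3])
print(coherent, whole)
```

In this toy case the 3x3 block scores essentially zero while the full matrix scores much higher, which is the kind of contrast a coherence-based objective rewards.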

Similar resources

GFBA: A Biclustering Algorithm for Discovering Value-Coherent Biclusters

Clustering has been one of the most popular approaches used in gene expression data analysis. A clustering method is typically used to partition genes according to their similarity of expression under different conditions. However, it is often the case that some genes behave similarly only on a subset of conditions and their behavior is uncorrelated over the rest of the conditions. As tradition...

Extending the definition of beta-consistent biclustering for feature selection

Consistent biclusterings of sets of data are useful for solving feature selection and classification problems. The problem of finding a consistent biclustering can be formulated as a combinatorial optimization problem, and it can be solved by the employment of a recently proposed VNS-based heuristic. In this context, the concept of β-consistent biclustering has been introduced for dealing with ...

A Binary Factor Graph Model for Biclustering

Biclustering, which can be defined as the simultaneous clustering of rows and columns in a data matrix, has received increasing attention in recent years, particularly in the field of Bioinformatics (e.g. for the analysis of microarray data). This paper proposes a novel biclustering approach, which extends the Affinity Propagation [1] clustering algorithm to the biclustering case. In particular...

Evolutionary Biclustering of Clickstream Data

Biclustering is a two way clustering approach involving simultaneous clustering along two dimensions of the data matrix. Finding biclusters of web objects (i.e. web users and web pages) is an emerging topic in the context of web usage mining. It overcomes the problem associated with traditional clustering methods by allowing automatic discovery of browsing pattern based on a subset of attribute...

DNA Microarray Data Analysis: A Novel Biclustering Algorithm Approach

Biclustering algorithms refer to a distinct class of clustering algorithms that perform simultaneous row-column clustering. Biclustering problems arise in DNA microarray data analysis, collaborative filtering, market research, information retrieval, text mining, electoral trends, exchange analysis, and so forth. When dealing with DNA microarray experimental data for example, the goal of bicluste...

Publication date: 2011